University of South Florida
Git Bash for Windows from:
Ctrl + Shift + P
Important
You should also create an account on GitHub. Optionally, register with USF for student privileges.
Check that your terminal recognizes the git command: type git in the terminal and confirm that no error message appears.
Make sure you are able to log in to GitHub.
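A quick way to run the terminal check above is to ask Git for its version. This is a minimal sketch; the exact version string depends on your installation.

```shell
# Print the installed Git version. An error such as "command not found"
# means Git is not installed or is not on your PATH.
git --version
```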
Distributed version control system, created by Linus Torvalds (2005)
“Track changes” made by team members, merge into main, etc.
Considered a “must” for software development, and many data science projects.
Has a learning curve, but it’s worth learning even for solo projects.
An online hosting platform built on the Git system
Easy to browse other repositories (public, free)
Others: GitLab, Bitbucket, GitBucket, etc…
Version control: track changes, revert to previous versions
Collaboration: multiple people can work on the same project
Backup: store your project on the cloud
Portfolio: showcase your work to potential employers
Open source: contribute to other projects
Nope.
Git can be done completely locally (w/o internet).
GitHub hosts Git repositories online, in the cloud.
Git is primarily designed to handle text files and tracks changes line by line.
It can store binary files (e.g. images, PDFs), but it doesn’t track their differences the way it does for text files.
GitHub has a small per-file size limit (100 MB).
It is intended for code tracking. Large data should be stored elsewhere.
git init initializes a new Git repository in the current working directory.
git status shows the status of changes as untracked, modified, or staged.
git status provides information about the state of your working directory and staging area. Here are the key terms:
Untracked files: Files that Git isn’t tracking yet.
Tracked files: Files that Git has been monitoring for changes.
Staged files: Files marked for the next commit (with git add).
git log shows the history of commits and their corresponding hashes.
:q to quit!
git add stages files for commit.
git commit records a version (snapshot) of your project, identified by a hash.
You should provide a message following -m that describes the version.
Go to your folder from terminal.
In the Shell folder, create a text file For_first_commit.sh.
In the file, add the text “Learning Git is Fun!”
Get out of Shell folder, and back to your class folder.
Now initialize Git in your class folder.
Run git status. What do you see?
Use git add . to stage all untracked files and folders.
Now let’s make a commit.
For commit message, use “My first commit”.
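One possible walk-through of the exercise above is sketched here. It assumes a throwaway temporary directory as a stand-in for your class folder, and the name and email are placeholders that are only needed if Git is not configured yet.

```shell
# Work in a temporary stand-in for the class folder (assumption for this sketch).
cd "$(mktemp -d)"
mkdir Shell

# Create the text file inside the Shell folder.
echo "Learning Git is Fun!" > Shell/For_first_commit.sh

# Initialize Git in the class folder (-b main needs Git 2.28 or newer).
git init -q -b main

# Placeholder identity for this repository only.
git config user.name "Student"
git config user.email "student@example.com"

git status            # the Shell/ folder is listed as untracked
git add .             # stage all untracked files and folders
git commit -q -m "My first commit"
git log --oneline     # one line per commit: short hash and message
```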
When you want to get back to previous commit (version).
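The sketch below compares the three modes of git reset on a throwaway repository. Here HEAD~1 means the commit just before the current one. The file name, contents, and identity are hypothetical.

```shell
# Placeholder identity so commits work in a fresh environment.
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
cd "$(mktemp -d)" && git init -q -b main

echo "first draft" > notes.txt
git add . && git commit -q -m "first"
echo "second draft with edits" > notes.txt
git add . && git commit -q -m "second"

# --soft: move HEAD back one commit; the edit stays staged.
git reset -q --soft HEAD~1
soft_state=$(git status --short)    # "M  notes.txt"
git commit -q -m "second"           # redo the commit

# --mixed (the default): move HEAD back; the edit stays in the file, unstaged.
git reset -q HEAD~1
mixed_state=$(git status --short)   # " M notes.txt"
git add . && git commit -q -m "second"

# --hard: move HEAD back and discard the edit entirely.
git reset -q --hard HEAD~1
cat notes.txt                       # back to the first draft
```

Of the three, only --hard throws away work in your files, so it is the one to use with care.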
HEAD is a pointer to where you currently are.
HEAD~1 means 1 commit before the current hash (version).
git reset B
git reset --mixed B
git reset --soft B
git reset --hard B
Go to your folder from terminal.
In the Shell folder, edit For_first_commit.sh file:
Get out of Shell folder, and back to your class folder.
Then check the status. What do you see?
Remember this status. We will come back to it after the reset.
Add all changes to the staging area.
Commit with message “Test second commit”
Check the status.
Check the commit logs with git log or git log --oneline.
git reset HEAD~1
git reset (hash)
git add .
git reset --hard HEAD~1
The process so far is only for your local version management, which is also completely fine for local work.
However, you can also synchronize your repository with a cloud server: GitHub.
git remote add registers a remote repository.
origin is the conventional name for the GitHub remote repo.
main is your local branch name.
git push sends your commits to the remote repository.
git pull does two operations at once: git fetch, then git merge.
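The remote commands can be tried without a GitHub account. The sketch below assumes a bare repository on local disk as a stand-in for GitHub; the folder names and identity are hypothetical.

```shell
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
cd "$(mktemp -d)"

# Stand-in for GitHub: a bare repository on local disk.
git init -q --bare -b main remote.git

# A local project with one commit.
git init -q -b main project && cd project
echo "hello" > README.txt
git add . && git commit -q -m "first"

# Register the remote under the conventional name "origin".
# With GitHub, the address would be an https://github.com/... URL instead.
git remote add origin ../remote.git

# Send the local main branch to the remote. -u remembers the pairing,
# so later calls can be plain "git push" and "git pull".
git push -q -u origin main

# Fetch and merge anything new from the remote (nothing new in this sketch).
git pull -q
```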
Create public repo
Name your repository nicely
DO NOT ADD README FILE (leave it unchecked)
Copy the address somewhere
What if you wanted to download someone else’s remote repository to your local machine? git clone is for downloading a remote repository for the first time.
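A minimal sketch of cloning, assuming a small local repository named upstream as a stand-in for someone else's GitHub address:

```shell
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
cd "$(mktemp -d)"

# Build a small repository to clone from (hypothetical stand-in for a GitHub URL).
git init -q -b main upstream
git -C upstream commit -q --allow-empty -m "first"

# Clone it: this copies the full history and sets up "origin" automatically.
git clone -q upstream my_copy
cd my_copy

git remote -v     # origin points back at the source repository
git branch -a     # local branch plus remote-tracking branches
```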
Check your branches from current directory with
Now browse your branches again and check the differences.
Browse your GitHub site and confirm that the files are uploaded.
https://github.com/yourGithubId/Reponame.git
Note
By default, Git only tracks files, not empty folders. A folder is tracked once it contains files.
Now, make changes to your R\>my_first_R_code.R file:
Modify the file: add print("Hello world!") to the code and save.
Then add, commit, and push to GitHub.
Browse history with:
Now, suppose your second commit (version) was an error.
You can come back to previous commit by:
or
At this point, ONLY your local repository is back to the state before the second commit. Your GitHub remote is still ahead, at the second commit!
If you want your remote to be back at the first commit as well:
Note that you need the --force option here.
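The rollback of both the local and the remote side can be sketched as follows. The sketch assumes a local bare repository as a stand-in for GitHub and a hypothetical file code.R. Force-pushing overwrites the remote branch, so it is best kept for repositories that nobody else is pulling from.

```shell
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
cd "$(mktemp -d)"
git init -q --bare -b main remote.git
git init -q -b main project && cd project
git remote add origin ../remote.git

# Two commits, both pushed to the remote.
echo "# first version" > code.R
git add . && git commit -q -m "first"
echo 'print("Hello world!")' >> code.R
git add . && git commit -q -m "second"
git push -q -u origin main

# Local rollback: main points at the first commit again.
git reset -q --hard HEAD~1

# A plain "git push" is rejected here because the local branch is behind
# the remote. --force overwrites the remote branch with the local one.
git push -q --force origin main
```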
For future updates, use
When using GitHub from other apps (e.g. Positron, VS Code, etc.), you’ll use a PAT (personal access token) instead of a password.
Tracked vs Untracked
Staged vs Unstaged
You create a new file example.txt → Untracked
Running git add example.txt stages the file → Staged
After modifying example.txt, it becomes unstaged until you run git add example.txt again
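The states above can be watched with git status --short, which prints a two-letter code per file. This is a minimal sketch on a throwaway repository; the identity is a placeholder.

```shell
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
cd "$(mktemp -d)" && git init -q -b main

echo "draft" > example.txt
git status --short      # "?? example.txt" : untracked

git add example.txt
git status --short      # "A  example.txt" : staged (new file)

git commit -q -m "add example"
echo "more text" >> example.txt
git status --short      # " M example.txt" : tracked, modified, unstaged

git add example.txt
git status --short      # "M  example.txt" : staged again
```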
By default, git pull is git fetch followed by git merge.
git pull has three methods:
git pull --no-rebase (merge, the traditional default)
git pull --rebase
git pull --ff-only
git merge
Combines changes into a new merge commit. If both sides changed the same lines, you will need to resolve the conflicts.
Content of file.txt
Alice edits the second line, commits, and pushes to main first:
Bob also edits the same line, but has not pulled Alice’s changes. He edits and commits locally:
When Bob tries to push his changes with git push, Git rejects the push because his local branch is behind the remote.
Bob runs git pull, which is git fetch followed by git merge. Git reports the conflict:
The file.txt now contains conflict markers
To resolve conflict, Bob should modify the file manually.
Bob stages this file and commits the resolution:
After resolving, Bob pushes with git push.
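The whole merge scenario can be replayed locally. In this sketch, Alice and Bob are two folders sharing a local bare repository as a stand-in for GitHub, and the contents of file.txt are made up for illustration.

```shell
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
cd "$(mktemp -d)"
git init -q --bare -b main remote.git

# Alice creates file.txt and publishes it (commit A).
git init -q -b main alice && cd alice
git remote add origin ../remote.git
printf 'line 1\nline 2\nline 3\n' > file.txt
git add . && git commit -q -m "A: initial file"
git push -q -u origin main
cd ..

# Bob clones before Alice's next change.
git clone -q remote.git bob

# Alice edits the second line and pushes first (commit B).
cd alice
printf 'line 1\nline 2 (Alice)\nline 3\n' > file.txt
git commit -q -am "B: Alice edits line 2" && git push -q

# Bob edits the same line and commits locally (commit C).
cd ../bob
printf 'line 1\nline 2 (Bob)\nline 3\n' > file.txt
git commit -q -am "C: Bob edits line 2"

git push -q 2>/dev/null || echo "push rejected: Bob is behind the remote"
git pull -q --no-rebase 2>/dev/null || echo "merge stopped on a conflict"
cat file.txt      # contains the <<<<<<<, =======, >>>>>>> markers

# Bob resolves by hand (here: keeping both edits), then commits and pushes.
printf 'line 1\nline 2 (Alice and Bob)\nline 3\n' > file.txt
git add file.txt
git commit -q -m "Merge: resolve conflict in file.txt"
git push -q
```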
git rebase
Git replays your local branch on top of the remote branch, creating a linear history:
Content of file.txt
Alice edits the second line, commits, and pushes to main first:
Bob also edits the same line, but has not pulled Alice’s changes. He edits and commits locally:
When Bob tries to push his changes with git push, Git rejects the push because his local branch is behind the remote.
Bob runs git pull --rebase, which is git fetch followed by git rebase:
Instead of creating a merge commit,
git rewinds Bob’s local commit C
applies Alice’s changes from remote (origin/main) B
reapplies C on top of B
When there’s no conflict (i.e., they did not change the same line of code), git pull --rebase won’t raise any conflict message.
However, in our scenario, there’s conflict since both modified the same line, so:
rebase is halted
should be continued after conflict resolution
Similar to merge, conflict should be resolved manually.
The file.txt now contains conflict markers
To resolve conflict, Bob should modify the file manually.
Bob stages this file and continues the rebase (no commit)
After that Bob pushes with git push. The history is linear.
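The rebase variant of the scenario can be replayed the same way. As before, this is a sketch with a local bare repository standing in for GitHub and made-up contents for file.txt. GIT_EDITOR=true is used only so that the commit message is accepted without opening an editor.

```shell
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
cd "$(mktemp -d)"
git init -q --bare -b main remote.git

# Commit A from Alice, then Bob clones.
git init -q -b main alice && cd alice
git remote add origin ../remote.git
printf 'line 1\nline 2\nline 3\n' > file.txt
git add . && git commit -q -m "A: initial file"
git push -q -u origin main
cd .. && git clone -q remote.git bob

# Commit B: Alice edits the second line and pushes first.
cd alice
printf 'line 1\nline 2 (Alice)\nline 3\n' > file.txt
git commit -q -am "B: Alice edits line 2" && git push -q

# Commit C: Bob edits the same line locally.
cd ../bob
printf 'line 1\nline 2 (Bob)\nline 3\n' > file.txt
git commit -q -am "C: Bob edits line 2"

# Git rewinds C, applies B, then halts while reapplying C.
git pull -q --rebase 2>/dev/null || echo "rebase halted on a conflict"

# Bob resolves by hand, stages the file, and continues (no new commit command).
printf 'line 1\nline 2 (Alice and Bob)\nline 3\n' > file.txt
git add file.txt
GIT_EDITOR=true git rebase --continue
git push -q
git log --oneline --graph     # a straight line: C on top of B on top of A
```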
git merge --ff-only
Used when you expect no conflicts.
Use it when you want a linear history and only want to move your local branch pointer forward, without creating a merge commit or rewriting your commits.
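A sketch of both outcomes of a fast-forward-only pull, using empty commits and a local bare repository as a stand-in for GitHub (all names are hypothetical):

```shell
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
cd "$(mktemp -d)"
git init -q --bare -b main remote.git
git init -q -b main alice && cd alice
git remote add origin ../remote.git
git commit -q --allow-empty -m "A" && git push -q -u origin main
cd .. && git clone -q remote.git bob

# Alice pushes a new commit; Bob has no local commits of his own.
git -C alice commit -q --allow-empty -m "B"
git -C alice push -q

cd bob
# Bob is strictly behind, so his branch pointer simply moves forward.
git pull -q --ff-only

# If Bob commits locally while Alice pushes again, the histories diverge
# and --ff-only refuses instead of merging or rebasing.
git commit -q --allow-empty -m "C (Bob)"
git -C ../alice commit -q --allow-empty -m "D (Alice)"
git -C ../alice push -q
git pull -q --ff-only 2>/dev/null || echo "not a fast-forward: choose merge or rebase"
```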
Git keeps track of all changes within the project directory
New file, folder
Modifications (changes or updates)
Deletion
If there are files or folders you don’t want to track, list them in a .gitignore file.
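A minimal .gitignore sketch follows. The file and folder names are hypothetical examples of things you might not want to track.

```shell
cd "$(mktemp -d)" && git init -q -b main

# Hypothetical project: one script to track, one secret file and one data folder to skip.
mkdir data
echo 'print("hi")' > analysis.R
echo "secret" > .env
echo "1,2,3" > data/big.csv

# Each line of .gitignore is a pattern; a trailing slash matches a folder.
printf '.env\ndata/\n' > .gitignore

git status --short     # lists .gitignore and analysis.R only
```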
If you want to remove untracked (not unstaged!) files:
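One way to do this is git clone's counterpart for cleanup, git clean, assuming that is the command intended here. The sketch below previews the deletion before performing it; file names are hypothetical.

```shell
export GIT_AUTHOR_NAME=Student GIT_AUTHOR_EMAIL=student@example.com
export GIT_COMMITTER_NAME=Student GIT_COMMITTER_EMAIL=student@example.com
cd "$(mktemp -d)" && git init -q -b main

# One tracked file, plus an untracked file and an untracked folder.
echo "keep" > tracked.txt
git add . && git commit -q -m "first"
echo "scratch" > temp.txt
mkdir tmpdir && echo "x" > tmpdir/a.txt

git clean -n -d     # dry run: only lists what would be removed
git clean -f -d     # actually deletes untracked files (-f) and folders (-d)
```

Files removed by git clean do not go through the recycle bin, so running the dry run first is a sensible habit.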
FIN4770: Programming for FinTech